Alexandre Henrique S. Dias
I’m a full-time Data Scientist and an MSc student in Electrical and Computer Engineering at UFRN. My research focuses on social network analysis, graph theory, and Natural Language Processing. The main programming languages I use are Python, R, C++, and SQL, and my favorite ML frameworks are Scikit-Learn and TensorFlow. I also have MLOps skills using GKE, Kubeflow, Kubernetes, and Docker.
Industry Experience
Data Scientist
Americanas S.A.
São Paulo, SP
Present - 2021
- Build ML models using Python, Scikit-Learn, and TensorFlow, and apply ML to a wide range of topics, such as Complex Network Analysis, Social Networks, NLP, and HR Analytics.
- Create ML pipelines using Kubeflow Pipelines on Google Cloud AI Platform, and participate in the design of CI/CD operations for ML models.
Data Scientist
Looqbox
São Paulo, SP
2021 - 2019
- Development of BI reports and dashboards using R, Python, and SQL.
- Maintainer of the Looqbox R Package used to build R objects and data structures compatible with the Looqbox Application.
Education
M. Sc., Electrical and Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
Present - 2021
- Research in complex network analysis, social networks, graph theory, and NLP.
- Tools: Python, NetworkX, Gephi, TensorFlow, WandB, Git.
MITx MicroMasters Program in Statistics and Data Science
MITx on edX
edX
2022 - 2020
- The MITx MicroMasters Program in Statistics and Data Science covers the fundamentals of data science, statistics, and machine learning.
B. Sc., Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2019 - 2018
- Researcher and member of the Modeling and Scientific Data Analysis team.
B. Sc., Sciences & Technology
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2017 - 2015
- Linear Algebra and Analytical Geometry Teaching Assistant.
- Calculus II Teaching Assistant.
Certificates & Courses
MicroMasters in Statistics and Data Science
MITx on edX
N/A
2022 - 2020
- 6.431x: Probability - The Science of Uncertainty and Data.
- 18.6501x: Fundamentals of Statistics.
- 6.86x: Machine Learning with Python - From Linear Models to Deep Learning.
- 14.310x/Fx: Data Analysis in Social Science.
- DS.CFx: Capstone Exam for Statistics and Data Science.
MLOps (Machine Learning Operations) Fundamentals
Coursera
N/A
2021
DataCamp completed tracks
DataCamp
N/A
2019 - 2018
- Data Scientist with Python.
- Data Analyst with Python.
- Data Manipulation with Python.
- Machine Learning with Python.
- Importing & Cleaning Data with Python.
- Python Programming.
- Python Programmer.
Academic Publications
Paper published in the 2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0&IoT). Naples, Italy.
Performance Evaluation of an Edge OBD-II Device for Industry 4.0
Institute of Electrical and Electronics Engineers
IEEE
2019
- Performance evaluation of an Edge OBD-II device that autonomously collects data from vehicles in order to provide customer feedback and tracking.
Research Experience
Undergraduate Researcher
Digital Metropolis Institute
UFRN
2019 - 2018
- Developed a traffic monitoring system using image recognition techniques.
Undergraduate Researcher
Department of Informatics and Applied Mathematics
UFRN
2017 - 2016
- Developed an interactive theorem prover based on Linear Logic using the Maude programming language.
Selected Data Science Writing
I enjoy reading about productivity, lifestyle, data science/AI, and statistics.
Dimensionality Reduction with Factor Analysis on Student Performance Data
N/A
2021
- A dimensionality reduction technique with interpretable outputs.
Stop Using the Elbow Method
N/A
2021
- Silhouette Analysis: A more precise approach to finding the optimal number of clusters using K-Means.
Scikit-Learn 1.0 - A true milestone
N/A
2021
- An overview of the design principles of Scikit-Learn and how the famous ML library became so popular.
The Expectation-Maximization (EM) Algorithm
N/A
2021
- Understanding the motivation behind the EM Algorithm and how it works.
A mathematical derivation of the Law of Total Variance
N/A
2020
- Understanding what the Law of Total Variance is and when to apply it.
Clustering with K-means: simple yet powerful
N/A
2019
- Explains what Cluster Analysis is and how the K-means algorithm works, along with its pros and cons.
An introduction to Linear Regression
N/A
2019
- Explains the assumptions behind Linear Regression, how to measure its performance, and how to implement it in Python.